HelpSteer3-Preference: Open Human-Annotated Preference Data across Diverse Tasks and Languages

Wang, Zhilin, Zeng, Jiaqi, Delalleau, Olivier, Shin, Hoo-Chang, Soares, Felipe, Bukharin, Alexander, Evans, Ellie, Dong, Yi, Kuchaiev, Oleksii

arXiv.org Artificial Intelligence

Preference datasets are essential for training general-domain, instruction-following language models with Reinforcement Learning from Human Feedback (RLHF). Each subsequent data release raises expectations for future data collection, meaning there is a constant need to advance the quality and diversity of openly available preference data. To address this need, we introduce HelpSteer3-Preference, a permissively licensed (CC-BY-4.0), high-quality, human-annotated preference dataset comprising over 40,000 samples. These samples span diverse real-world applications of large language models (LLMs), including tasks relating to STEM, coding, and multilingual scenarios. Using HelpSteer3-Preference, we train Reward Models (RMs) that achieve top performance on RM-Bench (82.4%) and JudgeBench (73.7%). This represents a substantial improvement (~10% absolute) over the previously best-reported results from existing RMs. We also demonstrate that HelpSteer3-Preference can be applied to train Generative RMs, and show how policy models can be aligned with RLHF using our RMs. Dataset (CC-BY-4.0): https://huggingface.co/datasets/nvidia/HelpSteer3#preference Models (NVIDIA Open Model): https://huggingface.co/collections/nvidia/reward-models-68377c5955575f71fcc7a2a3
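The abstract does not detail the training objective behind these RMs; as a minimal sketch, reward models trained on preference pairs like HelpSteer3-Preference commonly use the Bradley-Terry pairwise loss, which pushes the score of the human-preferred response above the rejected one. The function name below is illustrative, not taken from the paper.

```python
import math

def bradley_terry_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise preference loss: -log sigmoid(r_chosen - r_rejected).

    Minimizing this pushes the reward assigned to the human-preferred
    response above the reward assigned to the rejected response.
    """
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(x)) computed stably as log(1 + exp(-x))
    return math.log1p(math.exp(-margin))

# A correctly ordered pair incurs a smaller loss than a mis-ordered one.
well_ordered = bradley_terry_loss(2.0, 0.0)
mis_ordered = bradley_terry_loss(0.0, 2.0)
```

In practice the two rewards come from the same model scoring both responses to one prompt, and the loss is averaged over a batch of annotated pairs.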


... incorporating the reviewers' suggestions. Response to Reviewer #1. Comment 1: "The significance of the proposed method is not very clear"

Neural Information Processing Systems

We greatly appreciate the reviewers' effort and helpful comments. Comment 1: "The significance of the proposed method is not very clear..." It also has great theoretical significance in the optimization area. Though the convergence rate of this method could be suboptimal, it's a practical way to ... In addition, [6] shows some examples of saddle-point algorithms where projection onto the constraint sets is hard. Comment 2: "Why do we consider the nuclear norm constraint for this classification problem?" We find that this paper does not have sections 5.4 and 5.6.


Overall Response: We sincerely thank all the reviewers for their helpful comments and constructive suggestions

Neural Information Processing Systems

We sincerely thank all the reviewers for their helpful comments and constructive suggestions. Actually, the motivation for using attention and distillation differs from their origins. Regarding the comparison with related work: ...Net [C2] and NestedNet [C3] (results taken from their papers). The proposed method is robust to hyper-parameters.



We provide two responses to the common concerns raised by the reviewers, and then reply to each reviewer individually

Neural Information Processing Systems

We would like to thank all the reviewers for your helpful comments and suggestions. As shown in Appendix A.3, the layer-wise GCN network has the highest computational complexity in the computational propagation flow. Please see the response in Common Response 2. For a fair comparison, we only report the result on the semi-supervised task. Please see the response in Common Response 2.


R3: Robust Rubric-Agnostic Reward Models

Anugraha, David, Tang, Zilu, Miranda, Lester James V., Zhao, Hanyang, Farhansyah, Mohammad Rifqi, Kuwanto, Garry, Wijaya, Derry, Winata, Genta Indra

arXiv.org Artificial Intelligence

Reward models are essential for aligning language model outputs with human preferences, yet existing approaches often lack both controllability and interpretability. These models are typically optimized for narrow objectives, limiting their generalizability to broader downstream tasks. Moreover, their scalar outputs are difficult to interpret without contextual reasoning. To address these limitations, we introduce R3, a novel reward modeling framework that is rubric-agnostic, generalizable across evaluation dimensions, and provides interpretable, reasoned score assignments. R3 enables more transparent and flexible evaluation of language models, supporting robust alignment with diverse human values and use cases. Our models, data, and code are available as open source at https://github.com/rubricreward/r3.
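The abstract does not reproduce R3's exact prompting format; the following is a generic sketch of how a rubric-agnostic generative judge is typically driven: the rubric is supplied at evaluation time, the judge is asked to reason before scoring, and a score is parsed from its free-form output. All names here are illustrative assumptions.

```python
import re
from typing import Optional

def build_rubric_prompt(rubric: str, prompt: str, response: str) -> str:
    """Compose an evaluation prompt that asks a generative judge to
    reason against an arbitrary, caller-supplied rubric before scoring."""
    return (
        f"Rubric: {rubric}\n"
        f"User prompt: {prompt}\n"
        f"Model response: {response}\n"
        "First explain your reasoning, then end with 'Score: <1-5>'."
    )

def parse_score(judge_output: str) -> Optional[int]:
    """Extract the final 1-5 integer score from the judge's free-form text."""
    match = re.search(r"Score:\s*([1-5])\b", judge_output)
    return int(match.group(1)) if match else None
```

Because the rubric is an input rather than baked into training, the same judge can score helpfulness, safety, or factuality without retraining, which is the controllability the abstract highlights.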


The Pursuit of Empathy: Evaluating Small Language Models for PTSD Dialogue Support

BN, Suhas, Mahajan, Yash, Mattioli, Dominik, Sherrill, Andrew M., Arriaga, Rosa I., Wiese, Chris W., Abdullah, Saeed

arXiv.org Artificial Intelligence

This paper investigates the capacity of small language models (0.5B-5B parameters) to generate empathetic responses for individuals with PTSD. We introduce Trauma-Informed Dialogue for Empathy (TIDE), a novel dataset comprising 10,000 two-turn conversations across 500 diverse, clinically-grounded PTSD personas (https://huggingface.co/datasets/yenopoya/TIDE). Using frontier model outputs as ground truth, we evaluate eight small LLMs in zero-shot settings and after fine-tuning. Fine-tuning enhances empathetic capabilities, improving cosine similarity and perceived empathy, although gains vary across emotional scenarios and smaller models exhibit a "knowledge transfer ceiling." As expected, Claude Sonnet 3.5 consistently outperforms all models, but surprisingly, the smaller models often approach human-rated empathy levels. Demographic analyses showed that older adults favored responses that validated distress before offering support (p = .004), while graduate-educated users preferred emotionally layered replies in specific scenarios. Gender-based differences were minimal (p > 0.15), suggesting the feasibility of broadly empathetic model designs. This work offers insights into building resource-efficient, emotionally intelligent systems for mental health support.
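The abstract does not name the embedding model behind its cosine-similarity metric; under that assumption, scoring a small model's reply against a frontier-model reference reduces to comparing their embedding vectors, as in this minimal sketch.

```python
import math

def cosine_similarity(a: list, b: list) -> float:
    """Cosine similarity between two embedding vectors; used here to
    score a small model's reply against a frontier-model reference.
    Returns 1.0 for identical directions, 0.0 for orthogonal ones."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

In an evaluation loop, both the candidate reply and the ground-truth reply would first be embedded with the same sentence encoder before this comparison.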


Curriculum-RLAIF: Curriculum Alignment with Reinforcement Learning from AI Feedback

Li, Mengdi, Lin, Jiaye, Zhao, Xufeng, Lu, Wenhao, Zhao, Peilin, Wermter, Stefan, Wang, Di

arXiv.org Artificial Intelligence

Reward models trained with conventional Reinforcement Learning from AI Feedback (RLAIF) methods suffer from limited generalizability, which hinders the alignment performance of the policy model during reinforcement learning (RL). This challenge stems from various issues, including distribution shift, preference label noise, and mismatches between overly challenging samples and model capacity. In this paper, we attempt to enhance the generalizability of reward models through a data-centric approach, driven by the insight that these issues are inherently intertwined from the perspective of data difficulty. To address this, we propose a novel framework, Curriculum-RLAIF, which constructs preference pairs with varying difficulty levels and produces a curriculum that progressively incorporates preference pairs of increasing difficulty for reward model training. Our experimental results suggest that reward models trained with Curriculum-RLAIF achieve improved generalizability, significantly increasing the alignment performance of the policy model without incurring additional inference costs compared to various non-curriculum baselines. Detailed analysis and comparisons with alternative approaches, including data selection via external pretrained reward models or internal self-selection mechanisms, as well as other curriculum strategies, further demonstrate the superiority of our approach in terms of simplicity, efficiency, and effectiveness.
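The paper's actual difficulty measure is not given in the abstract; as an illustrative assumption, one common proxy is the score margin between the chosen and rejected responses (a small margin means a harder pair). The sketch below builds cumulative easy-to-hard stages from any such caller-supplied difficulty function.

```python
def build_curriculum(pairs, difficulty, num_stages=3):
    """Sort preference pairs from easy to hard and split them into
    cumulative stages for reward-model training.

    `difficulty` maps a pair to a scalar (larger = harder). Each stage
    keeps all earlier, easier pairs so training never forgets them.
    """
    ordered = sorted(pairs, key=difficulty)
    stage_size = max(1, len(ordered) // num_stages)
    stages = []
    for i in range(num_stages):
        end = len(ordered) if i == num_stages - 1 else (i + 1) * stage_size
        stages.append(ordered[:end])  # cumulative prefix of easiest pairs
    return stages
```

Training then proceeds stage by stage, so the reward model first fits clean, well-separated pairs before being exposed to noisy or borderline ones.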